Conversational agents in virtual worlds: Bridging disciplines

Authors

  • George Veletsianos
  • Robert Heller
  • Scott Overmyer
  • Mike Procter
Abstract

This paper examines the effective deployment of conversational agents in virtual worlds from the perspective of researchers/practitioners in cognitive psychology, computing science, learning technologies and engineering. From a cognitive perspective, the major challenge lies in coordinating and managing the various channels of information associated with conversation/communication and integrating this information with the virtual space of the environment and the belief space of the user. From computing science, the requirements include conversational competency, use of nonverbal cues, animation consistent with affective states, believability, domain competency and user adaptability. From a learning technologies perspective, the challenge is to maximise the considerable affordances provided by conversational avatars in virtual worlds, balanced against ecologically valid investigations regarding utility. Finally, the engineering perspective focuses on the technical competency required to implement effective and functional agents, and the associated costs to enable student access. Taken together, the four perspectives draw attention to the quality of the agent–user interaction, how theory, practice and research are closely intertwined, and the multidisciplinary nature of this area with opportunities for cross fertilisation and collaboration.

British Journal of Educational Technology Vol 41 No 1 2010 123–140 doi:10.1111/j.1467-8535.2009.01027.x © 2009 The Authors. Journal compilation © 2009 Becta. Published by Blackwell Publishing, 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street, Malden, MA 02148, USA.

Conversational agents able to interact with users have been deployed in numerous technology-enhanced learning environments, including video games, standalone applications and virtual worlds. Recent developments in virtual world technology have brought virtual characters (or avatars) to the forefront of research and development, with researchers proposing that knowledge gained from virtual world predecessors (e.g. Multi User Dungeons or text-based multi-user virtual worlds) can assist researchers and designers in implementing contemporary avatars and virtual characters (Mazar & Nolan, 2008). Indeed, virtual world researchers can learn from research on MUDs, but they can also learn a great deal from research and development work in diverse disciplines including online learning, instructional design (ID), educational psychology, psycholinguistics, information systems and human–computer interaction. This is the basic premise of this manuscript. Since research and practice in virtual worlds take place across numerous disciplines that are varied but interconnected, we can learn a lot from each other by collaborating across disciplinary lines. Thus, in this paper, we hope to provide a space where four researchers/practitioners from four diverse but converging disciplines have come together to answer one question: What are the barriers to the effective deployment and implementation of conversational avatars in virtual worlds (in teaching and learning contexts)? Effective and engaging deployment of virtual characters in educational settings has so far proven difficult (Dehn & van Mulken, 2000; Gulz & Haake, 2006), partly because collaboration across disciplinary lines seems to be limited. We hope that the ideas presented herein will lead to synergies and collaborative work across disciplines. To ensure consistency across the contributions we situate the question in the context of geriatric avatar–patients in medical simulations.
More specifically, the contributions from each discipline will assume that the avatar discussed will be a geriatric avatar with which medical students (represented by avatars themselves) can hold conversations such that they engage in the diagnosis of certain conditions based on the geriatric avatar’s input. The paper proceeds by presenting the perspectives of a cognitive psychologist, a computer scientist/human–computer interaction expert, an engineering practitioner, and a learning technologies researcher. Contributions are then summarised and future research and development work is proposed. It is important to note that within this paper, research and practice are inextricably intertwined. A second point to keep in mind while reading this paper is that references to animated pedagogical agents, conversational agents, chat bots, conversational avatars and virtual characters represent the same thing: virtual representations embedded in learning environments that serve pedagogical purposes. The variability in the terminology stems partly from lack of collaboration across disciplinary lines and partly from researchers wanting to highlight a specific part of the character. Although we could have chosen a uniform term to use across the manuscript, we decided that using these varied terms highlights the status quo and serves to illustrate the fact that different disciplines use different terminology for essentially similar items. Finally, while the word avatar will probably remind readers of virtual personas of real people, in this manuscript we are concerned with autonomous avatars whose activities (eg, movement, responses, and speech) are controlled by a back-end technology such as an artificial intelligence engine.
Conversational agents in virtual worlds: a cognitive perspective

The author of this section was trained as an Experimental Psychologist in memory and language, with an information processing perspective on cognitive function. Hence, much of his focus was on the user’s internal experiences defined and measured using experimental methods. Since this author’s training, theories of cognitive function have either reached outside of the black box to recognise the critical role of the environment in the creation of meaning (ie, situated cognition, see Brown, Collins & Duguid, 1989) or deeper inside the black box to connect with neurons as the biological bases of knowledge (ie, connectionism, see Rumelhart & McClelland, 1986). The author’s interests tend to be outside the box, especially how the environment can be extended to support limited cognitive function in special populations (Heller, Dobbs & Strain, 2009; Waller, 2007). Conversational agents are well poised to support cognitive function at any level. There are, however, barriers to their effective deployment that ironically are related to the strengths they enable. Specifically, conversational agents that are anthropomorphised as avatars in a virtual environment provide multiple channels of information that must be managed and coordinated such that agent–user interactions unfold naturally and efficiently. It is the presence of anomalous information or conflicting evidence between channels that can usurp the cognitive resources intended for a learning experience. In other words, the additional information afforded by a conversational agent avatar in a virtual space should not be a distraction but rather support the cognitive processes being trained. Information that is superfluous to the task at hand needs to be examined for ways of increasing its relevance or possible elimination.
To illustrate, we developed a historical figure application of a conversational agent based on Sigmund Freud and evaluated its performance under three conditions: agent without any visual representation, agent with a static image of Freud and agent with an animated image of Freud. We found that the ratings of learner satisfaction and conversational agent performance were highest in the first condition, in which no visual or auditory information was provided, and significantly different from the other two conditions with visual information (Heller & Procter, 2009). Other researchers have reported this all-or-nothing effect in the investigation of pedagogical agents (Dirkin, Mishra & Altermatt, 2005). We hypothesised that the visual information in the latter two conditions was poorly integrated with the conversational dialogue and subsequently distracted the user, leading to poorer ratings. Thus, the informationally rich environment afforded by a virtual world with an animated avatar needs to be carefully integrated with the conversational agent’s performance, as the consequence of mismatches can be significant and may undermine the learning activity. Moreover, the user’s expectations and beliefs must also be managed and coordinated with the learning activity since mismatches at this level can also be a distraction to the task at hand. In managing and coordinating the multiple channels of information afforded by an anthropomorphised conversational agent in a virtual world, it may be useful to examine findings from research on Animated Pedagogical Agents (APAs). In particular, the auditory channel is noteworthy given the report of a modality effect such that learning outcomes are improved when APAs use spoken voice as opposed to text only (Craig, Gholson & Driscoll, 2002; Moreno & Mayer, 2002).
Moreover, the quality of the voice, as well as its expressive capabilities (Veletsianos, 2009; Veletsianos, Miller & Doering, 2009), can further influence learning outcomes (Mayer, Sobko & Mautone, 2003). Clearly, an age appropriate voice would be required for a geriatric patient. However, using voice places additional constraints on avatar behaviour, as voice needs to be coordinated with lip synchronisation (matching the phonological properties of text with the associated lip movements). Voice without lip synchronisation may be seen as distracting and is likely to be experienced as less satisfying and immersive. Voice also needs to be modulated to reflect the natural speech patterns around syntactic units, and pauses, both filled and unfilled, should be inserted in a way that is consistent with their pragmatic function. In addition to managing the auditory properties of voice, the semantic properties of the response need to be synchronised or coordinated with the facial expressions, gaze location and/or body gestures. For instance, if the conversation focuses on the avatar’s arm, the avatar should look at her arm, perhaps hold it, or examine it. If the arm is causing pain, the facial expression should somehow reflect the degree of discomfort. In particular, facial expressions incongruous with semantic content could be seen as distracting and perhaps incorrectly diagnostic for disorders related to mood. Finally, pragmatic aspects of the conversation also need to be managed and coordinated. Cassell and Thórisson (1999) found that nonverbal behaviours related to turn taking (ie, gaze locations, head movements, body gestures) in the process of a conversation were associated with positive agent evaluations. They used the term Envelope Feedback and found that such feedback was even more important than the emotional expressions provided by the agents. Nonverbal behaviours can also be used to support other pragmatic aspects of communication like humour and indirect requests.
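The coordination requirement described above, in which gaze, gesture and facial expression follow the semantic content of what the patient avatar says, can be illustrated with a small sketch. This is a hypothetical illustration and not the system used by the authors: the names `BehaviourPlan`, `expression_for_pain`, `plan_response` and `find_mismatches`, the 0 to 10 pain scale, and the expression labels and thresholds are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class BehaviourPlan:
    """One coordinated set of channel outputs for a single agent turn (hypothetical)."""
    speech: str
    gaze_target: str
    gesture: str
    facial_expression: str


def expression_for_pain(pain_level: int) -> str:
    """Map an assumed 0-10 pain rating to an expression label (illustrative thresholds)."""
    if pain_level <= 0:
        return "neutral"
    if pain_level <= 3:
        return "slight_wince"
    if pain_level <= 6:
        return "grimace"
    return "strong_grimace"


def plan_response(speech: str, body_part: Optional[str] = None,
                  pain_level: int = 0) -> BehaviourPlan:
    """Derive gaze, gesture and expression from the same content as the speech."""
    expression = expression_for_pain(pain_level)
    if body_part is None:
        # No body part under discussion: keep eye contact with the user.
        return BehaviourPlan(speech, "user", "none", expression)
    # Look at the body part; hold it if it hurts, otherwise examine it.
    gesture = f"hold_{body_part}" if pain_level > 0 else f"examine_{body_part}"
    return BehaviourPlan(speech, body_part, gesture, expression)


def find_mismatches(plan: BehaviourPlan, body_part: Optional[str],
                    pain_level: int) -> List[str]:
    """List the channels whose output conflicts with the content of the turn."""
    problems = []
    expected_gaze = body_part if body_part is not None else "user"
    if plan.gaze_target != expected_gaze:
        problems.append("gaze")
    if plan.facial_expression != expression_for_pain(pain_level):
        problems.append("facial_expression")
    return problems


if __name__ == "__main__":
    plan = plan_response("My arm has been aching.", body_part="arm", pain_level=5)
    print(plan)
    # A smiling avatar that looks at the user while reporting arm pain is incongruent.
    incongruent = BehaviourPlan(plan.speech, "user", "none", "smile")
    print(find_mismatches(incongruent, "arm", 5))
```

Deriving every channel from one shared description of the turn is one simple way to avoid the incongruent signals discussed above; a fuller system would also need to handle timing, lip synchronisation and turn-taking cues, which this sketch leaves out.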
Wallis and Norling (2005) argue that pragmatic functions or social intelligence is a key area for development if conversational agents, anthropomorphised or not, are to make any headway in the simulation of social relations. In addition to these internal features ostensibly embedded within the patient avatar, there are the external features both directly reflected in the virtual environment and indirectly in the beliefs of the user about their role in the simulation. The objects and features of the environment must be built to support the simulation. The choice of objects and environmental features should extend and enhance the simulation and should not distract or be irrelevant to the simulation. The intelligence of the patient should recognise these objects and features when referenced in speech or behaviour and respond accordingly. The objects and features are an important part of the shared world between the patient and the user and must be used to support and enhance the




Journal:
  • BJET

Volume 41  Issue 

Pages  -

Publication date: 2010